Stair-Climbing Tests or Self-Reported Functional Capacity for Preoperative Pulmonary Risk Assessment in Patients with Known or Suspected COPD—A Prospective Observational Study

Background: This prospective study aims to determine whether preoperative stair-climbing tests (SCT) predict postoperative pulmonary complications (PPC) better than self-reported poor functional capacity (SRPFC) in patients with known or suspected COPD. Methods: A total of 320 patients scheduled for major non-cardiac surgery, 240 with verified COPD and 80 with GOLD key indicators but disproved COPD, underwent preoperative SRPFC assessment and SCT and were analyzed. Least absolute shrinkage and selection operator (LASSO) regression was used for variable selection. Two multivariable regression models were fitted, the SRPFC model (baseline variables such as sociodemographic, surgical and procedural characteristics, medical preconditions, and GOLD key indicators plus SRPFC) and the SCT model (baseline variables plus SCTPFC). Results: Among all stair-climbing variables, LASSO exclusively selected self-reported poor functional capacity. The cross-validated area under the receiver operating characteristic curve with bias-corrected bootstrapping 95% confidence interval (95% CI) did not differ between the SRPFC and SCT models (0.71; 0.65–0.77 for both models). SRPFC was an independent risk factor (adjusted odds ratio (OR) 5.45; 95% CI 1.04–28.60; p = 0.045 in the SRPFC model) but SCTPFC was not (adjusted OR 3.78; 95% CI 0.87–16.34; p = 0.075 in the SCT model). Conclusions: Our findings indicate that preoperative SRPFC adequately predicts PPC while additional preoperative SCTs are dispensable in patients with known or suspected COPD.


Introduction
More than 320 million patients undergo surgery each year worldwide [1]. Quantifying functional capacity or cardiopulmonary fitness is recognized as a pivotal step in preoperative cardiac risk assessment [2–6]. A self-reported functional capacity of less than two flights of stairs has been reported to improve preoperative cardiovascular risk classification [6]. Current guidelines recommend assessing the self-reported ability to climb two flights of stairs in patients referred for intermediate- or high-risk non-cardiac surgery [3].
Postoperative pulmonary complications (PPC) frequently occur after major surgery [7,8] and the prevalence is particularly high in patients with chronic obstructive pulmonary disease (COPD) [9–12]. There is growing evidence that individuals with COPD undergoing general anesthesia are at high risk for PPC [10,11]. The incidence of PPC is almost doubled in patients with COPD and the assumed underlying mechanism might be impaired mucociliary clearance of aspirated bacteria and impaired gas exchange [11–13]. However, in contrast to cardiovascular risk assessment, only very limited data exist regarding the ability of poor functional capacity to predict PPC after non-thoracic surgery [7,14–20]. While the prevalence of COPD is rising in the ageing society, COPD is still underdiagnosed and undertreated [21,22]. Assessment of functional capacity, for example with a stair-climbing test or six-minute walk test, is standard of care for staging and monitoring of disease progression in COPD [23,24]. Further, stair-climbing tests have a proven benefit for prediction of PPC after lung resection [25–30] and it has been proposed that they might play a relevant role for the prediction of PPC in patients undergoing non-thoracic surgery as well [7,14–20].
However, screening of surgical patients with stair-climbing tests is costly, labor intensive, and time consuming; it requires adequate infrastructural preconditions and health-care resources and may contribute to additional delays [31]. In contrast, self-reported poor functional capacity is easy to assess without additional expenses and efforts as it is already part of the preoperative routine in many regions and institutions [5,6,32]. However, the diagnostic value of self-reported poor functional capacity for the prediction of PPC is unknown.
This prospective study aims to determine whether preoperative stair-climbing tests predict PPC better than self-reported poor functional capacity in patients with known or suspected COPD undergoing major non-cardiac surgery.

Materials and Methods
The Preoperative Diagnostic Tests for Pulmonary Risk Assessment in Chronic Obstructive Pulmonary Disease (PREDICT) study is a prospective observational single-center study conducted in accordance with the Declaration of Helsinki. The study design, conduct, and reporting were carried out in accordance with the TRIPOD statement [33]. The study was registered with ClinicalTrials.gov (NCT02566343) and approved by the Ethics Committee of the Medical Association of Hamburg (PV4743, 5 August 2014). Participants gave written informed consent. The present findings result from an analysis of an independent dataset within the PREDICT study as outlined in the study protocol [10].

Patient Allocation and Data Collection
Adult patients who presented at our anesthesia preoperative assessment clinic between 18 November 2014 and 14 July 2016 were assessed for eligibility. Patients with GOLD key indicators for COPD [23] and a COPD assessment test (CAT™) [34] score ≥ 5 points, who were scheduled for major non-cardiac surgery, underwent a structured pulmonary risk stratification including preoperative spirometry. Spirometry was performed before and after bronchodilator application (200 µg salbutamol via a spacer) using a Spirobank G™ spirometer (Medical International Research, Rome, Italy; software: winspiroPRO™, version 3.2) according to the recommended standards [23,35]. Reference values were taken from a Caucasian population [10,36].
Inclusion criteria were elective major non-cardiac surgery, defined as an expected mortality >2% [37], with an anticipated operation duration ≥120 min and/or planned postoperative intensive care unit admission, planned general anesthesia, and an expected postoperative hospital stay of at least three days. Patients without GOLD key indicators (dyspnea, chronic cough, chronic sputum production, recurrent lower respiratory tract infections, or smoking history), patients with tracheostoma or planned tracheostomy, pregnant women, patients <18 years, and patients who were unable to perform or non-compliant with spirometry or assessment of functional capacity were excluded. Patients with confirmed COPD (forced expiratory volume in 1 s/forced vital capacity ≤0.7) were included in the COPD cohort and compared with a reference cohort that included 80 consecutive patients with GOLD key indicators and a CAT score > 5 points in whom spirometry disproved COPD (Figure 1). Self-reported poor functional capacity was assessed and a preoperative stair-climbing test was performed in all participants during their visit to the preoperative assessment clinic prior to elective surgery.

Self-Reported Poor Functional Capacity (SRPFC)
Within a structured routine preoperative interview, patients were asked if they were able to climb two flights of stairs. If the answer was 'no', this was documented as self-reported poor functional capacity of less than two flights of stairs [6].

Stair-Climbing Test (SCT)
All patients underwent a standardized preoperative stair-climbing test supervised and recorded by a member of the study team (observer). The stair-climbing test was performed in a sparsely frequented stairwell to ensure uniform, reproducible conditions for all patients. After at least 10 min of rest, patients were asked to climb the stairwell at an individually rapid pace. Patients were instructed not to skip steps and to use the railing only to keep their balance. The observer recorded the time from the start of the stair-climbing test until the patient reached the final landing or terminated the test prematurely due to symptoms (for example dyspnea, exhaustion, dizziness, or chest pain). The observer followed the patient at a short distance in order to avoid setting the pace. The steps of the staircase were 17 cm high and 21 cm deep. The test track had a total height of 12.07 m. The stairwell comprised 8 flights of stairs with a landing between each flight that could easily be passed with two or three additional steps.
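The reported stairwell geometry can be turned into simple derived quantities. The following is a minimal sketch, not the study's own calculation: the mean-power formula P = m·g·h/t, the function names, and the example patient (80 kg, 90 s) are illustrative assumptions, since the text does not state how pace or power were computed.

```python
# Geometry reported in the text
STEP_HEIGHT_M = 0.17     # each step is 17 cm high
TOTAL_HEIGHT_M = 12.07   # total height of the test track
G = 9.81                 # standard gravity in m/s^2 (assumed constant)


def step_count(total_height_m=TOTAL_HEIGHT_M, step_height_m=STEP_HEIGHT_M):
    """Number of steps implied by the reported heights."""
    return round(total_height_m / step_height_m)


def vertical_speed(height_m, seconds):
    """Mean vertical climbing speed in m/s."""
    return height_m / seconds


def climbing_power_watts(body_mass_kg, height_m, seconds):
    """Mean power against gravity, P = m * g * h / t (assumed definition)."""
    return body_mass_kg * G * height_m / seconds


# Hypothetical patient (not study data): 80 kg, full track in 90 s
steps = step_count()
speed = vertical_speed(TOTAL_HEIGHT_M, 90.0)
power = climbing_power_watts(80.0, TOTAL_HEIGHT_M, 90.0)
print(steps, round(speed, 3), round(power, 1))
```

For a test terminated prematurely, the height actually climbed would replace the full track height in both functions.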

Outcome Data
All patients were followed up until hospital discharge and all available outcome data were recorded. The clinical information systems Soarian Health Archive, release 3.04 SP12 (Siemens Healthcare), critical care information management system (ICM, version 8.12, Draeger Medical), and anesthesia charts were systematically screened for adverse events, newly developed clinical signs, new diagnoses, and new radiological findings. Health-care professionals in the operating theater, intensive care unit, and normal ward were blinded to study inclusion, and intra- and postoperative management was left to the discretion of the treating physicians.
In addition, a structured postoperative follow-up was conducted between the first and fifth day after extubation.

Sample Size
The approach of Riley and coauthors [38] was used to calculate the required sample size for model development; it uses three criteria. First, the sample size should ensure an accurate estimate of the overall outcome risk; we assumed a PPC rate of 65% [9,11,39] and a margin of error ≤0.05. Second and third, we required that the sample size lead to a shrinkage of predictor effects of 10% and to small optimism in the apparent model fit. A Cox-Snell R² of 0.8 and 8 candidate predictors were assumed to be appropriate. Based on these assumptions, a required sample size of 350 patients was approximated. Assuming a dropout rate of 5%, 365 patients were included in our study, and datasets from 320 patients were analyzed, leading to an acceptable relaxation of the restricting margin of error from 0.05 to 0.052 [38].
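The first criterion (precise estimation of the overall outcome risk) can be checked by hand. The sketch below covers only this criterion and assumes the usual normal-approximation margin of error for a proportion, δ = 1.96·sqrt(p(1−p)/n); the function names are illustrative. With the stated inputs it reproduces both the approximated sample size of 350 and the relaxed margin of 0.052 for 320 analyzed patients.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def n_for_margin(p, margin, z=Z_95):
    """Smallest n that estimates an outcome proportion p within +/- margin."""
    return math.ceil((z / margin) ** 2 * p * (1 - p))


def margin_for_n(p, n, z=Z_95):
    """Margin of error achieved for an outcome proportion p with n patients."""
    return z * math.sqrt(p * (1 - p) / n)


required_n = n_for_margin(0.65, 0.05)       # assumed PPC rate 65%, margin 0.05
achieved_margin = margin_for_n(0.65, 320)   # 320 analyzed datasets
print(required_n, round(achieved_margin, 3))
```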

Primary Endpoint
The primary endpoint of this analysis was PPC, as defined by the European Perioperative Clinical Outcome (EPCO) definitions, as a composite of respiratory infection, respiratory failure, pleural effusion, bronchospasm, atelectasis, pneumothorax, and/or aspiration pneumonitis [40].

Multivariable Model Development and Performance
To evaluate the incremental diagnostic value of the self-reported poor exercise capacity and the stair-climbing test, we fitted two multivariable logistic regression models, the 'SRPFC model' and the 'SCT model', and compared their ability to predict PPC in patients with GOLD key indicators for COPD undergoing major surgery. Complete case analysis was used for both models. Skewed predictors were log-transformed. The results are reported as odds ratios (OR) (95% confidence interval (CI)).
Both models include fixed baseline variables: sociodemographic data, medical preconditions, surgical covariables, and COPD-specific assessments (GOLD key indicators and pack years); these variables were included into the model without any further preselection as they already represent an established assessment bundle [10,23].
STATA's adaptive least absolute shrinkage and selection operator (LASSO) regression, which performs multiple LASSO regression analyses, each with cross validation, was used to select eligible variables out of the six 'stair-climbing variables' on top of the fixed baseline variables. This approach was chosen for variable selection and to correct for overfitting by shrinkage. A covariable was considered relevant for prediction if the β-coefficient was not shrunk to zero.
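Why a LASSO penalty sets some coefficients exactly to zero can be illustrated with the soft-thresholding operator, which gives the LASSO estimate in the simplest (orthonormal, linear) setting. This is a conceptual sketch only and not STATA's adaptive LASSO procedure; the variable names, coefficients, and penalty value are hypothetical.

```python
def soft_threshold(estimate, penalty):
    """Shrink an estimate toward zero by `penalty`; small estimates become 0.0."""
    if estimate > penalty:
        return estimate - penalty
    if estimate < -penalty:
        return estimate + penalty
    return 0.0


# Hypothetical unpenalized coefficients for six candidate predictors (not study data)
candidates = {
    "var_a": 1.20, "var_b": 0.10, "var_c": -0.05,
    "var_d": 0.30, "var_e": -0.20, "var_f": 0.00,
}
PENALTY = 0.40  # hypothetical tuning parameter, normally chosen by cross validation

shrunken = {name: soft_threshold(beta, PENALTY) for name, beta in candidates.items()}
selected = [name for name, beta in shrunken.items() if beta != 0.0]
print(selected)
```

A candidate is "selected" in this sense exactly when its shrunken coefficient is non-zero, which mirrors the relevance criterion stated above.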
In the next step, we fitted the 'SRPFC model', which incorporates self-reported poor exercise capacity in addition to the baseline variables, and the 'SCT model', which incorporates SCTPFC in addition to the baseline variables.
To determine the predictive performance, the average of the ten-fold cross-validated area under the receiver operating characteristic (ROC) curve (cvAUC) was calculated with bias-corrected bootstrapped (bc-b) 95% CI for both models.
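The averaging step behind a cross-validated AUC can be sketched as follows. The example assumes that out-of-fold predicted risks are already available for each held-out fold, uses two toy folds instead of ten, and omits the bias-corrected bootstrap interval; all data and function names are hypothetical.

```python
def auc(labels, scores):
    """Probability that a random case scores higher than a random non-case.

    Ties between a case and a non-case count as 0.5.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def mean_fold_auc(folds):
    """Average the AUC over held-out folds given as (labels, predicted risks)."""
    return sum(auc(labels, scores) for labels, scores in folds) / len(folds)


# Hypothetical held-out folds (1 = PPC, 0 = no PPC) with predicted risks
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([1, 1, 0, 0], [0.8, 0.4, 0.3, 0.5]),
]
cv_auc = mean_fold_auc(folds)
print(cv_auc)
```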

Descriptive Statistics
Sample characteristics are given as absolute and relative frequencies or mean (standard deviation) as well as median (interquartile range), whichever is appropriate. Differences between COPD and non-COPD patients were compared using Fisher's exact test, Student's t-test, or the Mann-Whitney U test, whichever was appropriate. A two-tailed p < 0.05 was considered statistically significant. We report nominal p-values without correction for multiplicity. Statistical analyses were performed using STATA, version 17.0 (StataCorp, College Station, TX, USA).

Results
A total of 31,714 patients were screened and 1271 individuals with GOLD key indicators received spirometry. A total of 365 patients were included in the PREDICT trial and 320 patients (240 with verified COPD and 80 with negative spirometry) were analyzed (12% dropout rate) (Figure 1) [10]. The dataset of this analysis is complete without missing values. Patients with confirmed COPD were grouped into COPD severity classes based on FEV1% values (78 GOLD I, 125 GOLD II, 28 GOLD III, and 9 GOLD IV) [10]. The baseline characteristics of the COPD and reference cohorts are given in Table 1. Patients with confirmed COPD more frequently terminated the stair-climbing test prematurely before reaching the final landing, generated less power, and climbed at a slower pace than patients in the reference cohort without COPD (Table 1). Of the analyzed patients, 65.6% (210/320) developed PPC, 47.5% in the non-COPD reference cohort and 71.7% in the COPD cohort (p < 0.001).
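The group comparison of PPC incidence can be reproduced approximately with a two-sided Fisher's exact test. The cell counts below are back-calculated from the reported percentages (47.5% of 80 = 38 and 71.7% of 240 ≈ 172, which sum to the reported 210) and are therefore an assumption, as is the choice of test for this particular comparison.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, with the margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: COPD, reference cohort; columns: PPC, no PPC (counts derived, see above)
p_value = fisher_exact_two_sided(172, 68, 38, 42)
print(p_value < 0.001)
```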

Multivariable Model Development and Performance
No missing values exist and all 320 analyzable datasets were used for model fitting. STATA's adaptive LASSO regression was applied to select eligible stair-climbing variables on top of the baseline variables and only selected self-reported poor exercise capacity (shrunken β-coefficient 3.39) while the β-coefficients of all other five stair-climbing candidate predictors were shrunk to zero (Table 2). Two multivariable regression models were fitted, the 'SRPFC model' (baseline variables plus self-reported poor exercise capacity) and the 'SCT model' (baseline variables plus SCTPFC). Interestingly, while self-reported poor exercise capacity was an independent risk factor in the 'SRPFC model' (adjusted OR 5.45; 95% CI 1.04-28.60; p = 0.045), SCTPFC was not an independent risk factor in the 'SCT model' (adjusted OR 3.78; 95% CI 0.87-16.34; p = 0.075). Sex (male), upper abdominal, and thoracic and mediastinal surgery were independent predictors for PPC in both models (Table 3). The cvAUC (bc-b 95% CI) did not differ between the 'SRPFC model' (0.712; 0.648-0.768) and the 'SCT model' (0.713; 0.650-0.770) (Figure 2).
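The reported odds ratios, confidence intervals, and p-values are mutually consistent, which can be verified by back-calculating a Wald test. The sketch assumes that the intervals are Wald-type, that is, symmetric on the log-odds scale; small deviations from the published p-values are expected because the inputs are rounded.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval


def wald_from_or(odds_ratio, ci_low, ci_high):
    """Recover log-odds coefficient, standard error, z, and two-sided p-value."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


_, _, z_srpfc, p_srpfc = wald_from_or(5.45, 1.04, 28.60)  # SRPFC model
_, _, z_sct, p_sct = wald_from_or(3.78, 0.87, 16.34)      # SCT model
print(round(p_srpfc, 3), round(p_sct, 3))
```

The wide intervals reflect large standard errors on the log-odds scale, which fits the observation that only a few patients were unable to climb two flights of stairs.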

Figure 2.
Receiver operating characteristic (ROC) curves and ten-fold cross-validated areas under the ROC curve (cvAUC) illustrate the discriminatory capability of two multivariable models to predict postoperative pulmonary complications in patients with known or suspected COPD undergoing major surgery. Abbreviations: bc-b 95% CI: bias-corrected bootstrapped 95% confidence interval; SRPFC: self-reported poor exercise capacity; SCT: stair-climbing test.

Discussion
In this secondary analysis of a prospective observational study in patients undergoing major non-cardiac surgery, we screened a large surgical population with more than 30,000 patients to identify high-risk individuals. Exercise testing was performed in 365 individuals with known or suspected COPD.
Interestingly, our analysis demonstrates that actually performing a stair-climbing test does not translate into a better diagnostic performance than simply asking the patient if he or she is able to climb two flights of stairs. Only a few patients were unable to climb two flights of stairs. Those who reported not being able to climb two flights of stairs were at high risk for PPC. Among all stair-climbing parameters, only self-reported poor functional capacity was selected by the LASSO regression. Self-reported poor functional capacity was an independent risk factor associated with more than five-fold increased odds of PPC, while poor functional capacity in the stair-climbing test was not an independent predictor. Beyond this, only the type of surgery and male sex were further independent risk factors.
The cross-validated area under the receiver operating characteristic curve with bias-corrected bootstrapping 95% confidence interval (95% CI) did not differ between the self-reported poor functional capacity and stair-climbing test models (0.71; 0.65-0.77 for both). Hence, our data indicate that preoperative self-reported poor functional capacity already adequately predicts PPC while preoperative stair-climbing tests did not further improve preoperative pulmonary risk assessment in patients with known or suspected COPD.
Poor functional capacity can either originate from cardiovascular comorbidities (for example, congestive heart failure) or dysfunction of the respiratory system, particularly in COPD. Functional capacity or cardiopulmonary fitness is an important component of preoperative risk assessment [3–6].
Guidelines recommend using self-reported functional capacity for preoperative cardiovascular risk assessment [3]. Recently, a large multicenter study demonstrated that self-reported functional capacity did not improve prediction of major adverse cardiovascular events compared with clinical factors [32]. While functional capacity is an established predictor for major adverse cardiac events [3–6], still little is known about its role for prediction of PPC [10,44].
It is an established part of the preoperative routine assessment in many institutions to screen functional capacity by posing a very simple question: 'Can you climb two flights of stairs?' [3,6,44]. It was unknown whether patients who answer 'no' are at increased risk for PPC or whether more sophisticated diagnostic measures such as stair-climbing tests or a six-minute walk test are required for this purpose [14,17,23,24,26–30]. Unfortunately, preoperative spirometry and blood gas analysis do not improve preoperative prediction of PPC in patients with known or suspected COPD [10].
The prevalence of COPD is rising rapidly and the worldwide mean prevalence has been estimated to be 13.1% [45]. While it has been recognized that COPD has great implications on patient outcome in perioperative medicine [9,11,46], still only very few studies have investigated preoperative pulmonary risk assessment in individuals with COPD [9].
Considering all this, the present study intended to evaluate if a structured quantitative assessment of the functional capacity by means of a stair-climbing test predicts PPC more accurately than a simple self-assessment of poor functional capacity gathered during preoperative consultation.
Stair-climbing tests are cumbersome, time consuming, costly, and inconvenient for the patient, as they might provoke symptoms such as dyspnea, exhaustion, dizziness, or chest pain. Moreover, stairwells are not uniform and differ significantly between buildings, regions, and countries; thus, test results are not interchangeable [19].
On the other hand, assessment of self-reported poor functional capacity preserves time, costs, and human and health-care resources. Stair-climbing tests are unfeasible in individuals with fractures of the lower limbs, certain types of disabilities, neuromuscular or rheumatic diseases or pain syndromes; however, in many of these individuals, self-reports might still be valuable. Hence, for practical reasons self-assessments might be more feasible and reproducible than exercise tests in many patients.
Beyond all this, the question remains whether prediction of PPC changes perioperative management, decision making, and finally, postoperative outcome [47]. Currently, pulmonary risk prediction tests are neither linked to specific preventive or therapeutic concepts nor do they provide clear therapeutic targets. Furthermore, the effectiveness of measures to reduce perioperative risk and to optimize perioperative care in patients with COPD, such as antiobstructive medication, preoperative pulmonary rehabilitation, choice of drugs, monitoring, or postoperative care, remains unclear [46]. However, the predicted risk for PPC can be particularly useful for shared decision making before surgery. In this context, the estimated risk for PPC might contribute to the individual decision to undergo surgery or not. Here, the key question is whether the benefit of surgery outweighs the combined risk of surgery and anesthesia [47].
This study has some limitations. Our data represent a single-center experience and caution should be taken extrapolating them to other institutions or different patient populations. Even though we performed a structured postoperative follow-up, assessment of postoperative complications was largely based on chart review. Further external validation could reinforce our findings.

Conclusions
COPD is associated with a high incidence of PPC. Preoperative pulmonary risk prediction is poorly defined and little is known about the role of functional capacity in these patients. Our findings demonstrate that a standard preoperative pulmonary risk assessment that includes sociodemographic data, medical history, medical preconditions, type of surgery, COPD-specific assessments, and self-reported poor functional capacity already sufficiently predicts PPC in patients with known or suspected COPD. Among all stair-climbing parameters, LASSO regression exclusively selected self-reported poor functional capacity. Self-reported poor exercise capacity was an independent predictor for PPC in the final multivariable model, whereas poor exercise capacity assessed by stair-climbing tests was not. In addition to the standard clinical assessment, preoperative stair-climbing tests did not achieve better diagnostic performance than a simple self-report of poor functional capacity in these patients and are therefore dispensable. The time, costs, and human and health-care resources required for a stair-climbing test could be better spent.